Accuracy of the estimation of computer resource usage

ABSTRACT

The present invention provides a method and system (1) for improving the accuracy of an estimate of computing system resource usage, the method comprising the steps of obtaining utilization data of a system resource and first transaction count data, obtaining further transaction count data, and processing the first transaction count data and the further transaction count data to provide an improved estimate of the number of transactions executed during a given time interval.

FIELD OF THE INVENTION

The present invention relates to a system and method for estimating computer resource usage and specifically, but not exclusively, to a system and method for improving the accuracy of the estimation of computer resource usage by transaction types for transaction processing systems.

BACKGROUND OF THE INVENTION

Resource usage estimation is becoming critical to modern computing systems. The advent of sophisticated multi-tasking and multi-threading operating systems and applications has allowed many transaction types to be executed concurrently on a single computing system.

A computing system may execute many transactions during a normal “day”. In a computing system, transactions may be grouped into subsets termed transaction types. These transaction types refer to functions or procedures carried out by the computer system. For example, there may be a function that calculates the stock level of a particular item, which may be designated by a name such as “stock-level”. In another example, there may be provided a function that generates a new order, and may be designated by a name such as “new-order”. Transactions belonging to the same type will usually have similar processing profiles. That is, transactions belonging to the same type will usually use a similar proportion of system resources. Information on the usage of computer resources by a given transaction type is important. It allows a programmer or system administrator to determine the main causes of system resource consumption and thereby attempt to optimise certain transaction types, or to optimise hardware and/or software components of the system. Such optimisation preferably results in an improvement in overall efficiency.

In many computer systems that process transactions, there is generally provided a log to which transaction processing information is written. The log generally contains both individual transaction data and summary data, which is written to the log at the conclusion of a defined time interval. Such logs can be used to estimate resource usage by transaction types. A user, operator or administrator may, for example, wish to ascertain how much CPU time an average transaction uses. It will be understood that operating systems also provide data such as processor and disk utilisation statistics (commonly expressed as the percentage of the resource used at any given time interval).

The standard method for estimating CPU usage per transaction for a given period of time would be to select the time period, find the CPU usage in that period, find the number of transactions in the log for that period, and divide the CPU usage by the number of transactions. This provides a simple value of the amount of time used by the CPU to process a transaction.

The applicant has found that the usefulness of information computed by this simplistic method is low for a number of reasons.

Firstly, a transaction may take more than one time interval to execute. However, the transaction is only “counted” in the time interval where processing of the transaction was finalised.

In other words, during any given time interval, there are potentially two sources of inaccuracy due to transactions crossing the interval boundary:

1. transactions which begin execution during a previous time interval (that is, at least a portion of their CPU usage occurs in the preceding time interval, yet they are counted against the current time interval).

2. transactions which begin execution during the current time interval, but do not finish in the current time interval (these transactions utilise some CPU in the current time interval but they are not counted in the current time interval).

These inaccuracies will balance out over a long time period (i.e. over a large number of time interval samples). However, these inaccuracies will distort the instantaneous computed CPU usage/transaction count ratio. Additionally, there are many situations where data cannot be collected over a long time period. For example, in “live run” or “online” computing systems, a peak of user activity may only occur for a short defined time, or activity may be erratic.

The above prior art method does not deal with individual transaction types (e.g. “stock-level” and “new-order”). However, different transaction types may require different quantities of computer resources. For example, the time taken to execute the transaction “stock-level” may be different from the time taken to execute the transaction “new-order”. It would be useful to determine resource requirements for each transaction type.

The problem outlined above was identified by the applicant in a previous patent application, PCT 09/110,000, filed Mar. 14, 2002 in the United States Patent Office, which is incorporated herein by reference. The previous patent application discloses a method for determining an estimate of computer resource usage for different transaction types by collecting suitable statistical data and applying a least squares algorithm to the statistical data to provide an estimate of the resources used by an individual transaction type in a transaction mix, which preferably provides a solution to this problem. However, this method does not address the problem of concurrently accounting for transactions which cross the interval boundary, as discussed above.

SUMMARY OF THE INVENTION

In a first aspect, the present invention provides a method of improving the accuracy of an estimate of computing system resource usage, comprising the steps of obtaining utilisation data of a system resource, obtaining first transaction count data, wherein the first transaction count data provides an indication of the number of transactions executed in a given time interval, obtaining further transaction count data, wherein the further transaction count data contains additional information relating to the execution time of a transaction, and processing the transaction count data and the further transaction count data, wherein the processed data provides an improved estimate of the number of transactions executed during a given time interval.

To increase the accuracy of the estimate of CPU usage/transaction count ratio, the applicants, in at least a preferred embodiment, take into account the timeline of transactions that cross the interval boundary of a time interval.

The present invention preferably provides for a more accurate estimation of resource usage for individual transaction types by collecting further information with regard to the time interval during which a transaction is executed. It will be understood that the term “execution” refers to the time interval during which a transaction is utilising computing resources.

Similarly, the term “executed” will be understood to refer to a transaction which has finished execution (i.e. the transaction is no longer utilising computing resources).

The applicant has determined that the prior art approach in the referenced PCT 09/110,000 may be inaccurate in certain situations since the execution time for a transaction may span two or more time intervals, yet the transaction is generally only logged or “counted” when the execution is finished, creating the impression that the transaction executed in only one time period. The applicant proposes the collection of further transaction count data to more accurately determine the true execution time of a particular transaction.

Preferably, in a first embodiment, the further transaction count data comprises a data set containing a count of the total number of transactions which have not finished execution within a given time interval.

In the first embodiment, where the first transaction count data records the time interval in which a transaction finished executing, the further transaction count data comprises a data set containing a count of the total number of transactions which are currently being processed within the given time interval, but have not finished execution during that time interval (i.e. the transaction has not finished processing at the time that the “snapshot” is taken).

The further transaction count data is collected, in the first embodiment, by providing a further transaction count mechanism in the form of a counter for each transaction type. When a snapshot is taken (i.e. at the end of a time interval), the counter for each transaction type will be incremented by one unit for each transaction of that type which has not finished execution.

This provides further data (in conjunction with the first transaction data) on the time interval/s in which a transaction is executing. Thus, if a transaction begins execution in one time interval, but finishes execution in another time interval, the counter will log or count this discrepancy, and the information gathered is used to adjust the “processed” data accordingly. This method has the advantage that it is cheap to implement computationally (i.e. it imposes only a minor burden on computing resources). This mechanism can be implemented in “real-time” without adversely affecting system load. Thus, with the first embodiment of the invention, it is possible to improve the accuracy of the transaction count data sourced from run-time systems working in real life environments.

Preferably, processing includes the further step of allocating the count of the total number of transactions, by an appropriate proportion, between an adjacent time interval and the given time interval.

Preferably, the appropriate proportion is 0.5.

In an application of the first embodiment of the present invention, the count data for each transaction type is allocated across two adjacent time intervals. That is, it is assumed that, for each transaction in processing at the moment of the snapshot, approximately half of the resources used to process the transaction are allocated to the given time interval and half to the adjacent time interval.

In other words, 0.5 counts are allocated to the time interval in which execution of the transaction began, and 0.5 counts are allocated to the given time interval, in which the transaction was completed.
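As a sketch only, this allocation can be written as a single expression. The symbols are introduced here for illustration and do not appear in the specification: $n_{i}$ is the number of transactions of a given type counted as finished in interval $i$, and $p_{i}$ is the number of transactions of that type still in processing at the snapshot that closes interval $i$. Assuming that every transaction in processing at a snapshot finishes in the immediately following interval, the adjusted count $\tilde{n}_{i}$ would be

$\tilde{n}_{i} = n_{i} + 0.5\,p_{i} - 0.5\,p_{i-1}$

For instance, with $n_{i} = 4$, $p_{i} = 1$ and $p_{i-1} = 0$, the adjusted count is 4.5. The first correction term credits interval $i$ with half of each transaction that began in it but finished later, and the second removes the half of each transaction counted in interval $i$ that was assumed to belong to interval $i-1$.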

In a second embodiment, the further transaction count data comprises a data set containing the actual start time and the actual finish time for each transaction.

Preferably, the data set is processed to determine a proportion of time expended by a transaction within the given time interval and an adjacent time interval.

In a situation where system load is not a critical factor, a more accurate measurement mechanism may be used. In this second embodiment, every transaction start time and finish time is logged or counted, thereby allowing an operator to determine the exact proportion of a transaction that should be allocated to a particular time interval.

This can be contrasted with the first embodiment, where the counter only records the occurrence of an event within a given time interval (i.e. the start of a transaction); it does not record the time at which the transaction began. Therefore, whilst the first embodiment records the fact that a transaction has spanned two time intervals, it provides no information on how this time should be divided between the two intervals. The second embodiment, by tracking the exact start time and finish time of every transaction, allows the user to calculate the proportion of each transaction time that should be allocated to respective time intervals.
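The proportional allocation of the second embodiment might be sketched as below. This is an illustration under stated assumptions, not the applicant's implementation: the function name `split_across_intervals` is hypothetical, times are plain numbers of seconds, and resource use is assumed to be spread evenly over a transaction's elapsed time.

```python
def split_across_intervals(start, finish, boundaries):
    """Return the fraction of a transaction's elapsed time in each interval.

    boundaries is an ascending list of snapshot times; interval k runs from
    boundaries[k] to boundaries[k + 1]. Assumes (hypothetically) that resource
    use is spread evenly between start and finish.
    """
    if finish <= start:
        raise ValueError("finish must be later than start")
    duration = finish - start
    fractions = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        # Length of the part of [start, finish] that lies inside [lo, hi].
        overlap = max(0.0, min(finish, hi) - max(start, lo))
        fractions.append(overlap / duration)
    return fractions


# A transaction that starts 50 s into the first one-minute interval and
# finishes 3 s into the second one.
fractions = split_across_intervals(50.0, 63.0, [0.0, 60.0, 120.0])
print(fractions)
```

Here ten of the thirteen elapsed seconds fall in the first interval, so roughly 0.77 of the transaction would be allocated to it and roughly 0.23 to the second.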

In a third embodiment, the further transaction data comprises a data set obtained by calculating the average transaction processing time for a given transaction type, and using the average transaction processing time to derive an estimate of the transaction time to be allocated to an individual transaction within a given time interval. The third embodiment collects, for each transaction, further transaction data that includes the start time of that transaction and, for each transaction type, the average response time. The average response time is calculated by collecting the sum of the actual response times of a particular transaction type for a large number of events, and then calculating the average response time from this information. At the snapshot time (i.e. the time at which the resource estimate is computed), the average time used by the transactions in processing for each transaction type is computed and the current average response time for this transaction type is determined.

Finally, the two values are divided to obtain the fraction of the transaction already executed. The fraction of the response time is then allocated across the two time intervals.
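The division described above could look like the following minimal sketch for one transaction type. The function name `fraction_executed` and the sample figures are hypothetical, and capping the result at 1.0 is an assumption added here for transactions that have already run longer than the average.

```python
def fraction_executed(start_times, snapshot_time, avg_response_time):
    """Estimate the fraction of in-processing transactions already executed.

    start_times holds the start time of each transaction of one type that is
    still in processing at snapshot_time. avg_response_time is the running
    average response time for that type. All times are in seconds.
    """
    if not start_times:
        return 0.0
    # Average time the in-processing transactions have used so far.
    avg_elapsed = sum(snapshot_time - s for s in start_times) / len(start_times)
    # Assumption: cap at 1.0 when transactions have overrun the average.
    return min(1.0, avg_elapsed / avg_response_time)


# Two transactions started at 50 s and 56 s; the snapshot is taken at 60 s
# and the type's average response time is 16 s.
done = fraction_executed([50.0, 56.0], 60.0, 16.0)
print(done)  # prints: 0.4375
```

Under these figures, 0.4375 of each in-processing transaction would be allocated to the interval ending at the snapshot and the remaining 0.5625 to the following interval.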

The third embodiment preferably improves the accuracy of the estimation of total resource usage for individual transaction types. It uses a different approach for estimating the effects of transactions whose processing time spans the interval time. Operationally and implementationally the cost of the third embodiment falls between the first and the second embodiments of the present invention. The accuracy of the third embodiment also falls between the accuracy of the first and second embodiments. This intermediate method can be used for “online” systems as it only moderately impacts on total system load.

Preferably, the above method comprises the further step of applying a mathematical model to the estimate of the number of transactions to provide an estimate of resource usage for individual transaction types within the computing environment.

In accordance with any embodiment of the present invention, the method disclosed in this document may be applied to further improve the results given by a method such as the one outlined in PCT 09/110,000, filed Mar. 14, 2002, in the U.S. Patent Office.

In PCT 09/110,000, there is disclosed a method for estimating the resource usage by individual transaction types, by collecting data relating to the number of transactions executed within a given time interval, and applying a least squares algorithm to isolate the resources used by an individual transaction type. The present invention may be used to further improve the accuracy of the resultant data produced by the invention disclosed in the above-mentioned application.

In a second aspect, the present invention provides a computing system arranged to facilitate the estimation of resource usage within a computer environment, comprising a data gathering means arranged to gather utilisation data of a computer resource and first transaction count data, wherein the first transaction count data provides an indication of the number of transactions executed in a given time interval, further data gathering means arranged to gather further transaction count data, wherein the further transaction count data contains additional information relating to the execution time of a transaction, and processing means arranged to process the first transaction count data and the further transaction count data, whereby the processed data provides an improved estimate of the number of transactions executed during a given time interval.

In a third aspect, the present invention provides a computer program arranged, when loaded on a computing system, to implement a method in accordance with the first aspect of the invention.

In accordance with a fourth aspect of the present invention, there is provided a computer readable medium providing a computer program in accordance with the third aspect of the present invention.

DETAILED DESCRIPTION OF THE DRAWINGS

Features and advantages of the present invention will become apparent from the following description of an embodiment thereof, by way of example only, with reference to the accompanying drawings, in which:

FIG. 1 illustrates a system for implementation of an embodiment of the present invention;

FIG. 2 is a time line diagram illustrating an example occurrence of transactions in relation to time intervals.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention broadly relates to a system and a corresponding method that preferably allows for an improved accuracy in the estimation of computer resource usage by transaction types for a transaction processing system.

In particular, an embodiment of the present invention applies to an Enterprise Application Environment, which may be implemented on a Windows™ or Unix™ platform. The term Enterprise Application Environment will be understood to mean a proprietary type of operating environment developed by Unisys Corporation and arranged to process many user requests at once. Moreover, other embodiments could equally be applied to any generic computing system arranged to process a large number of simultaneous user requests, such as a database application server, or a web server. It will also be understood that the present invention may be used in any computing system, whether the computing system consists of a single processor, or multiple processors or any other combination of hardware components, such as single or multiple storage drives.

In the following paragraphs, a simple example is used to illustrate the problem addressed by at least one embodiment of the present invention.

In many computer systems which process simultaneous user requests, there is generally provided a log to which transaction processing information is written. To minimise the logging overhead, the logging function will only write to the log file a single total of transaction counts for a given time interval (for example, at 5 second or 30 second intervals). A sample log appears in Table I:

TABLE I A sample log from a transaction processing system

05:40:00  === total tx count 5
05:40:14  TxPayment (terminal 01)
05:40:33  TxCustomerInquiry (terminal 08)
05:40:58  TxStockLevel (terminal 03)
05:41:00  === total tx count 8
05:41:03  TxStockLevel (terminal 05)
05:41:34  TxPayment (terminal 08)
05:41:45  TxCustomerStatus (terminal 02)
05:42:00  === total tx count 11
05:42:05  TxDelivery

This log contains both individual transaction data (in the table, the letters ‘tx’ are an abbreviation for the word ‘transaction’) and summary data that is written to the log at the conclusion of each time interval. Such logs can be used to estimate resource usage. A user, operator or administrator may wish to ascertain how much, say, CPU time an average transaction uses. It will be understood that operating systems also provide data such as processor and disk utilisation statistics (commonly expressed as the percentage of the resource used at any given time interval). An embodiment of the present invention may be equally applied to measure any such statistic relating to any computer system resource.

In the prior art, the standard method for estimating, say, CPU resource usage per transaction for a given period of time is:

-   Select time period: from 05:41:00 to 05:42:00
-   Determine the CPU resource usage in that period (say, 10 seconds of CPU time was used)
-   Determine the number of transactions in the log for that period (in our example: 3)
-   Divide the CPU resource usage by the number of transactions: 10/3=3.333
-   Therefore, average CPU/transaction is 3.333 seconds
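The arithmetic of this standard method can be reproduced with a few lines of code. The sketch below is illustrative only: the function name `cpu_per_transaction` is hypothetical, the finish times are copied from Table I, and the 10 seconds of CPU time is the figure assumed in the list above.

```python
def cpu_per_transaction(cpu_seconds, finish_times, period_start, period_end):
    """Divide the CPU time used in a period by the transactions logged in it."""
    # Zero-padded HH:MM:SS strings sort correctly as plain strings.
    count = sum(1 for t in finish_times if period_start <= t < period_end)
    if count == 0:
        raise ValueError("no transactions were logged in the period")
    return count, cpu_seconds / count


# Finish times of the individual transactions listed in Table I.
log_times = ["05:40:14", "05:40:33", "05:40:58", "05:41:03",
             "05:41:34", "05:41:45", "05:42:05"]
count, average = cpu_per_transaction(10.0, log_times, "05:41:00", "05:42:00")
print(count, round(average, 3))  # prints: 3 3.333
```

Only the finish time of each transaction enters the calculation, which is the source of the inaccuracy discussed next.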

The usefulness of information computed by this simplistic method is low for an important reason. A transaction may take more than one time interval to execute, yet the transaction is only “counted” in the interval in which processing of the transaction was finalised. For example, the transaction which finishes at 05:41:03 (TxStockLevel (terminal 05) in Table I) could have started execution at 05:40:50. In such a situation, a large proportion of the transaction execution (and therefore CPU resource usage) would have occurred in the preceding time interval.

Therefore, during any given time interval, there are two potential sources of inaccuracy:

-   transactions which begin execution during a previous time interval (that is, at least a portion of the CPU resource usage occurs in the preceding time interval, yet the transaction is only counted in the current time interval).
-   transactions which begin during the current time interval, but do not finish in the current time interval (these transactions utilise some CPU resources in the current time interval but they are not counted in the current time interval).

Such inaccuracies will balance out over a long time period (i.e. over a large number of time interval samples). However, these inaccuracies will distort the computed CPU usage/transaction count ratio.

A method commonly used to reduce this problem is to compute the average CPU/transaction ratio for an entire test “set” of data (or to use much larger interval times to reduce the number of transactions which cross an interval boundary). These approaches yield more accurate information and are sufficient when, for example, the user is only interested in gross, ball-park figures. However, there are at least two situations where this approach does not yield satisfactory results.

Firstly, this approach is not satisfactory where precise data on the variability of CPU/transaction ratio with time and/or load is needed. Most computer systems require more CPU time to process the same transaction under high load than under low load (this is caused by the extra processing used by locking, back-off algorithms, queue management, etc).

Secondly, this approach is not satisfactory where estimates of CPU time used by individual transaction types are needed. This is due to the fact that in a multi-user, multi-tasking and multi-threaded computing system, several transactions (each transaction potentially being a different transaction type) will be processed concurrently, such that the true time taken by a transaction of a particular type may be masked or distorted by the concurrent processing of other different transaction types. Thus, the prior art only allows a user to calculate an average transaction time for a mixed “basket” of transactions, and not for each individual transaction type. This is disadvantageous, since different transaction types will use different resources.

It will be understood that the preceding description refers to “CPU usage” (or “CPU utilisation”) as an example of a computer resource. In the context of the present invention, the phrase “CPU utilisation” will be understood to mean a value which represents a quantitative measurement of the CPU resources used by any transaction or action performed by an operating system or other software application. The use of a “CPU resource” could include, by way of example only, the loading of variables into the CPU register, the performing of arithmetic functions by the CPU, the flushing of on-board CPU cache, or any other function which is performed exclusively by the CPU and prevents other transactions from accessing or using the CPU.

The utilisation value could also represent any appropriate hardware or software resource, such as individual processes or functions within a larger application, or different applications residing concurrently on a computing system. It will be understood that any system resource may be measured, such as input/output parameters, number of context switches, network usage, etc.

FIG. 2 illustrates an example timeline of transaction execution and data collection. The horizontal axis denotes time, and each arrow in the diagram denotes an individual transaction, the start and end point of each arrow denoting the start and end of the execution time of each transaction. The vertical lines denote the time at which each “snapshot” is taken.

The most natural (frequently the only) method for collecting data in any computer system is to increment a transaction counter each time a transaction finishes. This results in a situation where, for example, the method in the applicant's previous application PCT 09/110,000 counts all the transactions which finish in a given time interval as if they were executed entirely during that time interval.

However, this methodology is not always correct. This inaccuracy is illustrated by the transactions marked tx_(a) and tx_(b), at the bottom of FIG. 2, near the snapshot times Tm1 and Tm2 respectively:

-   The tx_(a) transaction is counted as occurring in the time interval Tm1:Tm2, although a part of the execution time (and resource usage) occurs in the previous (Tm0:Tm1) time interval.
-   The tx_(b) transaction is not counted in the time interval Tm1:Tm2, as it finishes after Tm2, yet the tx_(b) transaction uses resources in the Tm1:Tm2 time interval.

Thus with typical transaction counting methodology, transactions are ascribed solely to a given interval, even though computer resources in the preceding interval were used. Moreover, the reverse situation is also possible. That is, transactions that finish in the next interval are not counted in the present interval.

Therefore, there is a basic inaccuracy in the data collected: while the measurements of, say, processor utilisation are accurate (to clock resolution) for a given interval, the transaction counts are not accurate. This inaccuracy is inherent in the way computer systems behave and are measured. Transactions arrive randomly from multiple users/sources. Therefore it is not possible (in principle) to ensure that transaction execution will never cross the interval boundary. Regardless of how the time interval between measurements is chosen, there will always (statistically) be transactions that begin their execution in one time interval and finish in another time interval.

The inaccuracy of resource usage estimates is high if many transactions are split between intervals, that is, when the time intervals are short and transaction processing time is long.

Therefore this inaccuracy increases the error of the estimate of resource usage by each transaction type. The original method will yield satisfactory results if a statistically significant number of samples is gathered, i.e. when enough data is collected. However, improving accuracy is important for several reasons:

Firstly, it would enable an operator, during system testing, to obtain more accurate data from shorter test runs (thus saving computer resources and time) which can then be translated into more accurate estimates of resource time usage.

Secondly, in production systems it enables “fine-grain” estimates of transaction resource usage, within shorter periods of time. Note that in production systems there is only a very limited possibility of extending the measurement time period. When peak hour traffic is measured, and the peak lasts, say, one hour, then this is the only data available, and maximum benefit must be delivered from this limited data set.

This improvement preferably enables the use of small time intervals (thereby providing more accurate measurements) by reducing the effect of boundary conditions (split transactions).

In order to obtain an estimate of system resource usage for each transaction type, the method disclosed in the earlier application (namely PCT 09/110,000, filed before the United States Patent and Trademark Office on Mar. 14, 2002) can be used. The method disclosed in PCT 09/110,000 will now be briefly discussed.

The number of transactions of each type executed in the time period is collected in the log, as is the processor utilisation. Sample raw data is shown in Table II below:

TABLE II Sample CPU utilisation data and transaction counts

Time      CPU_util  nNewOrder  nStockLevel  nDelivery
10:35:10  0.985     3          2            1
10:35:19  0.743     5          3            4
10:35:31  0.650     8          3            6
10:35:40  0.344     10         4            7

Table II is, in effect, an overdetermined system of equations (that is, a system of simultaneous equations that contains sufficient information to be solved by application of the appropriate methodology), in the form A*X=B, where B represents the product of the first and second columns of the table. The product of the first and second columns of the table provides a measure of the time (in milliseconds) taken by the CPU to process the transactions listed in the corresponding row. The matrix A represents a matrix comprising the remaining columns of the table. That is, matrix A contains the number of transactions processed within a given period of time, the transactions being grouped by process type. The matrix A, as derived from the data presented in Table II, is shown in Table III.

TABLE III The matrix ‘A’, as derived from the data given in Table II

$A = \begin{bmatrix} 3 & 2 & 1 \\ 5 & 3 & 4 \\ 8 & 3 & 6 \\ 10 & 4 & 7 \end{bmatrix}$

The vector X represents a vector of coefficients giving the usage for each transaction type.

This overdetermined set of equations may be solved by the standard linear least squares solution:

$X = \left( A^{T} * A \right)^{-1} * \left( A^{T} * B \right)$

The linear least squares solution embodied in the above equation is a well known method which is described in many undergraduate text books. See, for example, [Johnson et al “Applied Multivariate Statistical Analysis” 3rd ed Prentice Hall].

A solution to a system of equations in accordance with the above method is disclosed in PCT 09/110,000, filed Mar. 14, 2002 in the United States Patent Office. An example of the method disclosed in PCT 09/110,000 is outlined below.

In accordance with an example given in PCT 09/110,000, a sample of data gathered (i.e. the first transaction data) is shown below in Table IV.

TABLE IV Sample transaction and time interval data

CPU[ms]   Tx1  Tx2  Tx3
195.417   4    2    2
261.513   6    3    1
31.6187   3    2    0
186.385   0    3    1
101.492   6    2    0
79.3373   0    3    1
340.892   5    3    1
245.999   2    3    0
123.910   1    0    2
50.4557   2    2    0

The matrix A is given by columns Tx1, Tx2 and Tx3 of Table IV.

$A = \begin{bmatrix} 4 & 2 & 2 \\ 6 & 3 & 1 \\ 3 & 2 & 0 \\ 0 & 3 & 1 \\ 6 & 2 & 0 \\ 0 & 3 & 1 \\ 5 & 3 & 1 \\ 2 & 3 & 0 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix}$

The vector B represents the first column of the table (that is, the column marked “CPU”).

$B = \begin{bmatrix} 195.417 \\ 261.513 \\ 31.6187 \\ 186.385 \\ 101.492 \\ 79.3373 \\ 340.892 \\ 245.999 \\ 123.910 \\ 50.4557 \end{bmatrix}$

Therefore, substituting into the standard linear least squares solution we obtain the following equation:

$X = \left( \begin{bmatrix} 4 & 2 & 2 \\ 6 & 3 & 1 \\ 3 & 2 & 0 \\ 0 & 3 & 1 \\ 6 & 2 & 0 \\ 0 & 3 & 1 \\ 5 & 3 & 1 \\ 2 & 3 & 0 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix}^{T} * \begin{bmatrix} 4 & 2 & 2 \\ 6 & 3 & 1 \\ 3 & 2 & 0 \\ 0 & 3 & 1 \\ 6 & 2 & 0 \\ 0 & 3 & 1 \\ 5 & 3 & 1 \\ 2 & 3 & 0 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix} \right)^{-1} * \left( \begin{bmatrix} 4 & 2 & 2 \\ 6 & 3 & 1 \\ 3 & 2 & 0 \\ 0 & 3 & 1 \\ 6 & 2 & 0 \\ 0 & 3 & 1 \\ 5 & 3 & 1 \\ 2 & 3 & 0 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix}^{T} * \begin{bmatrix} 195.417 \\ 261.513 \\ 31.6187 \\ 186.385 \\ 101.492 \\ 79.3373 \\ 340.892 \\ 245.999 \\ 123.910 \\ 50.4557 \end{bmatrix} \right)$

Solving this equation, we find that the values for X are

$X = \{ 13.0585,\ 39.2245,\ 50.4133 \}$

suggesting that the processor usage for type 1 processes is approximately 13 ms, for type 2 processes the value is approximately 39 ms, and for type 3 processes the value is approximately 50 ms.
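The calculation can be checked numerically. The following is a minimal sketch rather than the implementation of the earlier application: it forms the normal equations from the rows of Table IV and solves them by Gaussian elimination in plain Python. The function and variable names are introduced here for illustration.

```python
# CPU time per row (ms) and per-type transaction counts, copied from Table IV.
cpu_ms = [195.417, 261.513, 31.6187, 186.385, 101.492,
          79.3373, 340.892, 245.999, 123.910, 50.4557]
counts = [[4, 2, 2], [6, 3, 1], [3, 2, 0], [0, 3, 1], [6, 2, 0],
          [0, 3, 1], [5, 3, 1], [2, 3, 0], [1, 0, 2], [2, 2, 0]]


def solve_linear(m, v):
    """Solve m * x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        # Move the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def least_squares(a, b):
    """Return X = (A^T A)^-1 (A^T B) by solving the normal equations."""
    k = len(a[0])
    ata = [[sum(row[i] * row[j] for row in a) for j in range(k)]
           for i in range(k)]
    atb = [sum(row[i] * y for row, y in zip(a, b)) for i in range(k)]
    return solve_linear(ata, atb)


usage_ms = least_squares(counts, cpu_ms)
print([round(u, 2) for u in usage_ms])
```

The three values returned should agree closely with the figures quoted for X, that is, roughly 13 ms, 39 ms and 50 ms for transaction types 1, 2 and 3.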

The present applicants have determined that the accuracy of the estimate obtained by the method outlined in PCT 09/110,000 may be improved by the collection of more accurate data with regard to the time intervals in which a transaction was executed.

To increase the accuracy of the estimate of CPU usage/transaction count ratio, the applicant has determined that it is necessary to take into account the timeline of transactions that cross the interval boundary on both ends of the time interval.

A system in accordance with an embodiment of the present invention is illustrated in FIG. 1.

There is shown a computing system 1 on which runs an operating system 2, and optionally other third party software applications 3.

An embodiment of the present invention 4 comprises a data gathering means 5 which interacts with the operating system and/or the third party applications to gather transaction process data and raw system resource utilisation data.

The data gathering means may be implemented by appropriate software/hardware or by any convenient means known to the skilled person in the art, to collect data as required by the following description of an embodiment of the present invention.

The system also provides a further data gathering means 6, which collects further transaction data. This data is processed by a processing means 7 to provide processed transaction count data 8 as output data. It will be understood that the further data gathering means may also be implemented by appropriate software/hardware or by any convenient means known to the skilled person.

In a first embodiment, there is provided a method for improving the accuracy of the estimation of resource usage per transaction type for systems in production, dubbed the “quick method”.

This method comprises:

-   Writing to the application log, at the end of each time interval, further transaction data comprising the number of transactions of each type in processing at the moment of the snapshot.
-   Assuming that for each transaction in processing at the moment of the snapshot, approximately half of the resources are expended in the first time interval and the other half of the resources are expended in the following time interval.
-   Adjusting the transaction count in both time intervals by a value of 0.5.
-   Applying the least squares method to the augmented count values, in accordance with the earlier apparatus, to estimate resource usage per transaction type.

In the present example, the further transaction data is collected by the use of a cumulative counter. A cumulative counter represents a common practice in “real world” situations, since cumulative counters are simple to implement and run on computing systems.

The application log contains the total count of transactions that completed execution at each interval (i.e. nPayment and nStockLev). For example, at clock interval 05, the transaction “nPayment” has occurred (and completed execution) 2 times. See Table V below. In addition, the numbers of transactions of each type in processing while the snapshot is taken (i.e. pgPayment and pgStockLev) are also collected.

TABLE V
Sample application log

Clock  CPU   nPayment  nStockLev  pgPayment  pgStockLev
05     0.79  2         3          1          0
10     0.19  3         4          0          1
15     0.38  7         2          1          1

For the time interval finishing at 10 clock units, the values 3, 4 (corresponding to the total number of Payment and StockLevel transactions executed in the given time interval) are used in the system of equations in accordance with the applicant's earlier patent application, namely PCT 09/110,000. Analysing the transactions in progress, there is one StockLevel transaction active at the time of the snapshot (pgStockLev at time 10 is 1).

Therefore, the StockLevel transaction count becomes 4+0.5=4.5.

There are no Payment transactions active at time 10, but there was one active at the end of the previous time interval (pgPayment at time 05 is 1). Therefore, one of the three Payment transactions executed in this time period began execution in the previous time period. So, the adjusted Payment transaction count is 3−0.5=2.5. Overall these adjustments change the original equation from

5*0.19=3*CPUPayment+4*CPUStockLevel

to

5*0.19=2.5*CPUPayment+4.5*CPUStockLevel

Applying the above considerations to all the rows in the table, an adjusted and more accurate system of equations is obtained, which is solved, in one embodiment, by the linear least squares method. In general terms the rules for adjusting transaction counts for each transaction type, in accordance with the first embodiment, are as follows:

-   Add 0.5 to the transaction count for each transaction in progress at the snapshot that ends this time interval.
-   Subtract 0.5 from the transaction count for each transaction in progress at the snapshot that ended the previous time interval.
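The two adjustment rules may be sketched as follows for a single transaction type. This is an illustrative sketch only: the function name `quick_adjust` and the parameter `carried_in` (the number of transactions in progress before the first logged interval, assumed here to be zero) are hypothetical and do not appear in the application log.

```python
def quick_adjust(completed, in_progress, carried_in=0):
    """Apply the 0.5 adjustment rules to one transaction type.

    `completed[i]` is the count of transactions completed in interval i and
    `in_progress[i]` is the count still in processing at the snapshot that
    ends interval i. `carried_in` is an assumed count of transactions in
    progress before the first interval.
    """
    adjusted = []
    previous = carried_in
    for done, active in zip(completed, in_progress):
        # Add half for each transaction still running at this snapshot and
        # subtract half for each one that was running at the previous snapshot.
        adjusted.append(done + 0.5 * active - 0.5 * previous)
        previous = active
    return adjusted


# Values from Table V (clock 05, 10 and 15).
payment = quick_adjust([2, 3, 7], [1, 0, 1])    # [2.5, 2.5, 7.5]
stock_level = quick_adjust([3, 4, 2], [0, 1, 1])  # [3.0, 4.5, 2.0]
```

For the interval finishing at 10 clock units this reproduces the adjusted counts of 2.5 for Payment and 4.5 for StockLevel derived in the text.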

This method requires collection of additional data: for each transaction type it is necessary to count the number of transactions in progress. In most systems such data can be collected easily and with a minimal run time overhead.

Therefore, this method is applicable to production systems, under normal working conditions.

This method preferably improves the accuracy of the resource use estimates, but it is assumed that the transaction in progress is evenly divided between two time periods. This is statistically true over a large sample, but for any given time period a transaction might be split in any proportion between the first and the second time period. However, this method is attractive for systems that operate under high load conditions, as the method is simple to implement and only requires a minimal amount of computing resources.

In a second embodiment, there is provided a method for improving the accuracy of the estimation of computer resource usage particularly suited for systems in testing, dubbed the “full method”.

This method comprises:

-   Writing to the application log, for each executed transaction, the following further transaction count data: transaction type, start time and finish time.
-   After each test run, the further transaction count data is collated with snapshot times to obtain precise values of the portion of each transaction executed in each time interval.
-   The precise values derived are employed to adjust transaction counters.
-   The adjusted equations are solved using the same method employed in the first embodiment, preferably by using the linear least squares method.

For example, if it is known that the transaction tx_(a) (from FIG. 2) started at 10:30:09 and finished at 10:30:13, then only 1 second of the transaction was executed in the previous time period (before 10:30:10) and 3 seconds were executed in the time period after 10:30:10. Therefore, the appropriate transaction counters can be adjusted by 0.25 and 0.75. (In practice the timings are collected and collated at millisecond level and much more precise adjustments of transaction counters can occur.)
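The collation step may be sketched as follows. This is an illustrative sketch under stated assumptions: times are expressed as seconds past 10:30:00, snapshot times are supplied as a sorted list of interval boundaries, and the names `split_transaction` and `fractional_counts` are hypothetical. A transaction with zero duration contributes nothing in this sketch.

```python
def split_transaction(start, finish, boundaries):
    """Return the fraction of one transaction executed in each interval.

    Interval i spans boundaries[i] to boundaries[i + 1].
    """
    duration = finish - start
    fractions = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        # Length of the overlap between the transaction and this interval.
        overlap = max(0.0, min(finish, hi) - max(start, lo))
        fractions.append(overlap / duration if duration > 0 else 0.0)
    return fractions


def fractional_counts(log, boundaries):
    """Sum the per-interval fractions for each transaction type.

    `log` holds (transaction type, start time, finish time) entries.
    """
    counts = {}
    for tx_type, start, finish in log:
        row = counts.setdefault(tx_type, [0.0] * (len(boundaries) - 1))
        for i, part in enumerate(split_transaction(start, finish, boundaries)):
            row[i] += part
    return counts


# The transaction from the example: started at :09, finished at :13,
# with snapshots at :05, :10 and :15.
parts = split_transaction(9.0, 13.0, [5.0, 10.0, 15.0])  # [0.25, 0.75]
```

The resulting fractional counts would replace the whole-number transaction counts in the system of equations before it is solved.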

This approach provides the best possible accuracy, but it is costly in terms of computer resources. The logging of start-finish time of each transaction is required, which places run time overheads on the logging system (such as the I/O and CPU). Such data collation requires a significant amount of disk space, processor time and computer memory, which presently makes this method practical only for off-line analysis. This method may be primarily used in system benchmarking and/or testing, as it is unlikely that system administrators would allow such overhead in production systems on a routine basis.

Statistically this method provides a better approximation than the quick method.

In a third embodiment, there is provided a method for improving the accuracy of estimation of resource usage for systems in testing and in production, dubbed the “intermediate method”.

The third approach to improving the accuracy of transaction counts is different from the “full method” and the “quick method”: it uses a different approach to estimating the effects of transactions in processing at the interval time. Operationally and implementationally the cost of the intermediate method falls between the “full” and the “quick” methods. The accuracy of the intermediate method also falls between the accuracy of the “full” and “quick” methods. The intermediate method can be used for systems in production, though some system administrators would probably not use it on a routine basis.

The intermediate method comprises:

-   Collecting, for each transaction, further transaction count data comprising a data set of the start time of the transaction (in memory), and the average response time (in memory).
-   At the snapshot time, the average time used by the transactions in processing for each transaction type is computed.
-   The current average response time for this transaction type is determined.
-   The two values are divided to obtain the fraction of the transaction already executed.

This method requires fewer computer resources than the full method since:

-   It avoids collation of individual transaction response times with snapshot times, and hence it uses fewer computing system resources than the full method.
-   It avoids logging each transaction start time to the file (log), which further reduces data collection overhead, especially in the input/output subsystem.
-   It requires only keeping in memory the start time of each transaction in progress (i.e. unfinished transactions), which is usually a small number and requires only several kilobytes of memory.
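The computation at snapshot time in the intermediate method may be sketched as follows for a single transaction type. This is an illustrative sketch only: the function name `executed_fraction` is hypothetical, and capping the result at 1.0 (for transactions that have already run longer than the average response time) is an assumption that is not stated in the description.

```python
def executed_fraction(snapshot_time, start_times, avg_response_time):
    """Estimate the fraction already executed by in-progress transactions.

    `start_times` are the in-memory start times of the unfinished
    transactions of one type; `avg_response_time` is the current average
    response time for that type, in the same time unit.
    """
    if not start_times or avg_response_time <= 0:
        return 0.0
    # Average time used so far by the transactions in processing.
    avg_elapsed = sum(snapshot_time - s for s in start_times) / len(start_times)
    # Assumed cap: a transaction cannot be more than fully executed.
    return min(1.0, avg_elapsed / avg_response_time)


# Two transactions started at 9.0 and 8.0, snapshot at 10.0,
# average response time 4.0 time units.
fraction = executed_fraction(10.0, [9.0, 8.0], 4.0)  # 0.375
```

In this sketch the estimated fraction would take the place of the fixed value of 0.5 used by the quick method when apportioning in-progress transactions between adjacent time intervals.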

The intermediate method can be used for systems in production and is effective enough for systems in testing. The intermediate method provides a better approximation of real transaction counts than the quick method but is not as accurate as the full method. The source of inaccuracy resides in the fact that the response times at the moment of the snapshot are only an approximation of response times of transactions in progress (while the full method uses actual rather than average response times).

It will be understood that whilst an embodiment of the present invention may be applied to the invention disclosed in PCT 09/110,000, filed Mar. 14, 2002, in the United States Patent Office, the present invention has broader application and may be applied to preferably improve the accuracy of any suitable resource usage estimation method.

In the abovementioned embodiments of the present invention, raw data is obtained from an application that is integral to a contemporary computer operating system, but it is to be understood that the data may be obtained in any appropriate way. For example, data may be obtained from a facility that is integral to the operating system, from a facility that is integral to an application residing on a computing system, or alternatively the data collection process may be a facility provided integrally with an embodiment of the present invention. Many contemporary operating systems allow a user to produce a “log” which contains information regarding the utilisation of one or more hardware resources.

It will be further understood that the data may be collected in a different form from the procedure outlined in the above examples. For example, it may be possible to collect data directly from the operating system, or directly from a hardware monitor.

Modifications and variations as would be apparent to a skilled addressee are deemed to be within the scope of the present invention.

1. A method of improving the accuracy of an estimate of computing system resource usage, comprising the steps of, obtaining utilisation data of a system resource, obtaining first transaction count data, wherein the first transaction count data provides an indication of the number of transactions executed in a given time interval, obtaining further transaction count data, wherein the further transaction count data contains additional information relating to the execution time of a transaction, and processing the transaction count data and the further transaction count data, wherein the processed data provides an improved estimate of the number of transactions executed during a given time interval.
2. A method in accordance with claim 1, wherein the further transaction count data comprises a data set containing a count of the total number of transactions that have not finished execution within a given time interval.
3. A method in accordance with claim 1, wherein the further transaction count data comprises a data set containing the start time and finish time for each transaction executed.
4. A method in accordance with claim 3, wherein the data set is processed to determine a proportion of time expended by a transaction within the given time interval and an adjacent time interval.
5. A method in accordance with claim 2, wherein processing includes the step of allocating the count of the total number of transactions, by an appropriate proportion, between an adjacent time interval and the given time interval.
6. A method in accordance with claim 5, wherein the appropriate proportion is 0.5.
7. A method in accordance with claim 1, wherein the further transaction data comprises a data set obtained by calculating the average transaction processing time for a given transaction type, and using the average transaction processing time to derive an estimate of the transaction time to be allocated to an individual transaction within a given time interval.
8. A method in accordance with any one of the preceding claims, wherein the method comprises the further step of applying a mathematical model to the estimate of the number of transactions to provide an estimate of resource usage for individual transaction types within the computing environment.
9. A computing system arranged to facilitate the estimation of resource usage within a computer environment, comprising a data gathering means arranged to obtain utilisation data of a computer resource and first transaction count data, wherein the first transaction count data provides an indication of the number of transactions executed in a given time interval, further data gathering means arranged to gather further transaction count data, wherein the further transaction count data contains additional information with regard to the execution time of a transaction, and processing means arranged to process the first transaction count data and the further transaction count data, whereby the processed data provides an improved estimate of the number of transactions executed during a given time interval.
10. A system in accordance with claim 9, wherein the further data gathering means is arranged to obtain count data comprising the total number of transactions that have not finished execution within a given time interval.
11. A system in accordance with claim 9, wherein the further data gathering means is arranged to log the start time and finish time for each transaction.
12. A system in accordance with claim 11, wherein the processing means is arranged to process the data set to determine a proportion of time expended by a transaction within the given time interval and an adjacent time interval.
13. A system in accordance with claim 10, wherein the processing means is arranged to allocate the count of the total number of transactions, by an appropriate proportion, between an immediately preceding time interval and the given time interval.
14. A method in accordance with claim 13, wherein the appropriate proportion is 0.5.
15. A system in accordance with claim 9, further comprising calculation means, arranged to calculate the average transaction processing time for a given transaction type, and further calculate an estimate of the transaction time to be allocated to an individual transaction within a given time interval.
16. A computer program arranged, when loaded on a computing system, to implement the method of any one of claims 1 to 8.
17. A computer readable medium providing a computer program in accordance with claim 16.